A Phase Vocoder Model of the Glottis for Expressive Voice Synthesis

نویسنده

  • Eduardo Reck Miranda
چکیده

Abstract In this paper we explain how we are improving the source component of a source-filter vocal synthesis system. Our strategy for this improvement involves the replacement of the pulse generator by a phase vocoder module whose coefficients are derived from the analysis of speech signals. Firstly, we introduce the context of our research and then indicate the problem; finally, we present our solution followed by our conclusions and suggestions for further work.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Pulse Model in Log-domain for a Uniform Synthesizer

The quality of the vocoder plays a crucial role in the performance of parametric speech synthesis systems. In order to improve the vocoder quality, it is necessary to reconstruct as much of the perceived components of the speech signal as possible. In this paper, we first show that the noise component is currently not accurately modelled in the widely used STRAIGHT vocoder, thus, limiting the v...

متن کامل

The NII speech synthesis entry for Blizzard Challenge 2016

This paper decribes the NII speech synthesis entry for Blizzard Challenge 2016, where the task was to build a voice from audiobook data. The synthesis system is built using the NII parametric speech synthesis framework that utilizes Long Short Term Memory (LSTM) Recurrent Neural Network (RNN) for acoustic modeling. For this entry, we first built a voice using a large data set, and then used the...

متن کامل

Residual-Based Excitation with Continuous F0 Modeling in HMM-Based Speech Synthesis

In statistical parametric speech synthesis, creaky voice can cause disturbing artifacts. The reason is that standard pitch tracking algorithms tend to erroneously measure F0 in regions of creaky voice. This pattern is learned during training of hidden Markov-models (HMMs). In the synthesis phase, false voiced / unvoiced decision caused by creaky voice results in audible quality degradation. In ...

متن کامل

Effectiveness of fundamental frequency (F0) and strength of excitation (SOE) for spoofed speech detection

Current countermeasures used in spoof detectors (for speech synthesis (SS) and voice conversion (VC)) are generally phase-based (as vocoders in SS and VC systems lack phaseinformation). These approaches may possibly fail for nonvocoder or unit-selection-based spoofs. In this work, we explore excitation source-based features, i.e., fundamental frequency (F0) contour and strength of excitation (S...

متن کامل

GlottDNN - A Full-Band Glottal Vocoder for Statistical Parametric Speech Synthesis

GlottHMM is a previously developed vocoder that has been successfully used in HMM-based synthesis by parameterizing speech into two parts (glottal flow, vocal tract) according to the functioning of the real human voice production mechanism. In this study, a new glottal vocoding method, GlottDNN, is proposed. The GlottDNN vocoder is built on the principles of its predecessor, GlottHMM, but the n...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1999